Introduction


Factor analysis is a data reduction technique that represents a set of
variables in a lower dimensional space. That is, the purpose of factor analysis
is to represent a set of variables as a linear combination of a smaller set of
latent variables, namely factors, plus an overall mean. Given measurements on a
set of variables, the analysis yields estimates of the factors (the factor
scores), the regression coefficients of each variable on the factors (the
factor loadings), and the error variances. When the factors are assumed to be
oblique, an estimate of the factor correlations is also provided.

When measurements of a set of variables are available from several data
sources we could analyze each data matrix independently and compare the
results. This approach results in as many sets of factor scores, factor
loadings and error variance estimates as there are data sources. A more
parsimonious way is to analyze all the data matrices jointly, with some
additional assumptions, reducing the number of parameters to be estimated.

Several methods have been proposed to accomplish this parsimony. For
example, if we assume that all the data sources share the same factor loadings
we have the factorial invariance model proposed by Lawley and Maxwell(1963) and
Meredith(1964a, 1964b). A general estimation procedure under this model is
found in Joereskog(1971), where he expresses the degrees of factor invariance in
terms of the strength of the equality assumptions among parameters. In this
approach the main interest is to test whether the factor loadings matrices are
the same or not. That is, the equality, or the factor invariance, is treated
as an external assumption to be tested rather than as part of the structural
model.

Another approach, proposed by Tucker(1966) and later by Harshman(1970) (see
also Harshman and Lundy(1984)), incorporates some equality of the parameters as
an essential part of the structural model. For example, Harshman's model,
called the PARAFAC model, expresses each data matrix as a product of (common)
factor scores, (common) factor loadings, and a weight matrix which represents
the differences among the data sources. Tucker's model, called the Three Mode
Factor Analysis (TWA) model, introduces another set of parameters in a core
matrix which describes the relationship or the interactions among the variables
and the data sources. It is known (Harshman and Lundy(1984)) that the PARAFAC
model is a special case of Tucker's TWA model. Several estimation procedures
are available for both models. For the PARAFAC model, Harshman(1970, 1972)
provides a least squares estimation method and Sands and Young(1980) provide a
nonmetric extension of the least squares solution. For the TWA model, least
squares estimation is provided by Kroonenberg and de Leeuw(1980) and maximum
likelihood estimation by Bentler and Lee(1978, 1979).

One advantage of the PARAFAC model is that it provides a unique set of
factors in the sense that there are no degrees of freedom for rotation. In
contrast, the TWA model, though it provides a better reproduction of the data
because of its generality, does not have this uniqueness property. Naturally,
the PARAFAC model is more parsimonious and easier to interpret. Its
disadvantage is that it will generally yield larger error variances.

However, the PARAFAC model is not usually expressed in terms of
traditional factor analysis terminology and is not yet widely used.


The  purpose  of  this  paper  is  to  provide  a  traditional  factor  analytic  view  of 
the  PARAFAC  model  and  its  maximum  likelihood  estimation  procedure.  Also,  an 
extension  to  the  multimode  case  is  discussed. 


Model 


Suppose that there are p variables, s data sources, $N_k$, k=1,2,...,s,
observations from each data source, and r factors. The PARAFAC model can be
expressed in terms of factor scores, factor loadings, data source weights and
error terms as follows.

(1)  $y_{ik} = m_k + A W_k f_{ik} + e_{ik}, \quad i = 1, 2, \ldots, N_k, \; k = 1, 2, \ldots, s,$


where

$y_{ik}$ is the p x 1 vector of observations on the individual i
in the data source k,

$m_k$ is the p x 1 grand mean vector for the data source k,

A is the p x r factor loadings matrix common to all the data sources,

$W_k$ is the r x r diagonal weight matrix for the data source k,

$f_{ik}$ is the r x 1 factor score vector of individual i
in the data source k,

and

$e_{ik}$ is the p x 1 error vector of individual i
in the data source k.

Also, the parameters associated with each individual are assumed to have the
following properties.

(2)  $E[ f_{ik} ] = 0, \quad i = 1, 2, \ldots, N_k, \; k = 1, 2, \ldots, s,$

     $D[ f_{ik} ] = C_p, \quad i = 1, 2, \ldots, N_k, \; k = 1, 2, \ldots, s,$

     $E[ e_{ik} ] = 0, \quad i = 1, 2, \ldots, N_k, \; k = 1, 2, \ldots, s,$

and

     $D[ e_{ik} ] = D_k, \quad i = 1, 2, \ldots, N_k, \; k = 1, 2, \ldots, s,$

where E and D are, respectively, the expectation and dispersion
operators and $D_k$, k=1,2,...,s, is assumed to be diagonal.

Also, it is assumed that each f and e are uncorrelated. To avoid the
multiplicative indeterminacy it is assumed that diag[ $C_p$ ] = I and that the
average of the squared $W_k$ matrices is equal to the identity matrix of size r.

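The additive indeterminacy

(3)  $m_k + A W_k f_{ik} = ( m_k + A W_k \bar{f}_k ) + A W_k ( f_{ik} - \bar{f}_k )$

or

     $m_k + A W_k f_{ik} = ( m_k + A m ) + A W_k ( f_{ik} - W_k^{-1} m ),$

where $\bar{f}_k$ is the mean factor score in the data source k and m is an
arbitrary r x 1 vector, is taken care of by setting the mean of the factor
scores equal to zero.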

Usually the PARAFAC model assumes that each individual is measured s times
on the same set of variables (i.e., on s occasions), but here we assume
that each data source has a different set of individuals. By assuming that the
factor scores in every data source have the same factor dispersion matrix,
$C_p$, the uniqueness property of the original model is retained.

In terms of the dispersion matrix, the model can be written as

(4)  $C_k = A W_k C_p W_k A' + D_k, \quad k = 1, 2, \ldots, s,$

where $C_k$ is the p x p sample dispersion matrix of the data source k.
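As a numerical illustration of (4), the following sketch computes the
dispersion matrix implied by the model for one data source. It is only a
minimal example under stated assumptions: the helper names (`matmul`,
`transpose`, `diag`, `implied_dispersion`) and all numerical values are
hypothetical and are not taken from the analysis reported in this paper.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def diag(values):
    """Build a square diagonal matrix from a list of diagonal elements."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def implied_dispersion(A, w_k, C, d_k):
    """Return A W_k C W_k A' + D_k for one data source, as in (4).

    A   : p x r factor loadings (list of rows)
    w_k : length-r list holding the diagonal of W_k
    C   : r x r factor dispersion matrix
    d_k : length-p list holding the diagonal of D_k (error variances)
    """
    AW = matmul(A, diag(w_k))                       # p x r product A W_k
    common = matmul(matmul(AW, C), transpose(AW))   # A W_k C W_k A'
    p = len(A)
    return [[common[i][j] + (d_k[i] if i == j else 0.0) for j in range(p)]
            for i in range(p)]


# Hypothetical toy values: 3 variables, 2 factors, 2 data sources that
# differ only in the weight given to the first factor.
A = [[0.8, 0.0], [0.7, 0.2], [0.0, 0.9]]
C = [[1.0, 0.3], [0.3, 1.0]]
d = [0.4, 0.5, 0.3]
C_1 = implied_dispersion(A, [1.2, 1.0], C, d)
C_2 = implied_dispersion(A, [0.6, 1.0], C, d)
```

In this toy case the variance of the first variable, which loads only on the
first factor, is larger in the source with the larger first weight, while the
variance of the third variable, which loads only on the second factor, is the
same in both sources.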

The uniqueness property is the result of the fact that, after a nonsingular
transformation of the factors, the model can in general no longer be expressed
as the product of a p x r matrix, an r x r diagonal matrix and an r x 1 vector
without changing the goodness of fit. That is, the right-hand sides of the set
of s identities

     $A W_k f_{ik} = A W_k T^{-1} T f_{ik}, \quad k = 1, 2, \ldots, s,$

where T is an r x r nonsingular matrix, cannot always be expressed as

     $B V_k ( T f_{ik} ), \quad k = 1, 2, \ldots, s,$

where B is a p x r matrix and $V_k$, k=1,2,...,s, is a diagonal matrix.

The model stated above can be interpreted in various ways. First, by
defining a new variable

(5)  $z_{ik} = W_k f_{ik}, \quad i = 1, 2, \ldots, N_k, \; k = 1, 2, \ldots, s,$

we have

(6)  $y_{ik} = m_k + A z_{ik} + e_{ik},$

or

(7)  $C_k = A C_{zk} A' + D_k,$

where $C_{zk} = D[ z_{ik} ] = W_k C_p W_k$,

which is a special case of the factorial invariance model where the differences
among the data sources are expressed in terms of the differences among the
factor dispersions and the error variances. Interestingly, to this author's
knowledge, this form of factorial invariance model has not been proposed
before. The reason seems to be that the usual selection theorem does not imply
this form of relationship among the factor correlation matrices.

Another way to interpret the model is as follows. By defining a new
factor loadings matrix

(8)  $B_k = A W_k, \quad k = 1, 2, \ldots, s,$

we have

(9)  $y_{ik} = m_k + B_k f_{ik} + e_{ik},$

or

(10)  $C_k = B_k C_p B_k' + D_k,$

where  the  differences  are  explained  in  terms  of  the  differences  among  the 
factor  loadings  or  the  regression  coefficients  of  each  variable  on  the  factors. 
In either case, the elements of the $W_k$ matrix are considered to be the
relative weights, or importance, of the factors. That is, when the (l,l)
element of the $W_k$ matrix has a relatively high value, the implication is
that the l-th factor is relatively more important than the rest of the factors
in the k-th data source. However, since the mean of each factor across the
individuals is set equal to zero, this does not mean that those variables whose
factor loadings are high on the l-th factor have higher values. Instead, the
larger weight generally implies larger variances of those variables whose
factor loadings are high on the l-th factor. For example, if the k-th data
source is highly selective on the basis of those variables whose factor
loadings are high on the l-th factor, we should expect that the mean of those
variables is high and that the weight of the l-th factor in the k-th data
source is low, resulting in a high mean and a small variance of those variables
in the k-th data source.


As mentioned before, the model we are dealing with does not allow us to
compare the factor score means of each data source. When this is desired, a
slightly different formulation of the model must be used. Several methods
which enable the comparison of the factor score means will be discussed in a
later section.


Extension to the Four Mode Situation

Suppose that each data source can be expressed as a combination of two
categories. For example, if a test battery is administered to a set of
individuals we could divide the entire sample into six subsamples defined by
sex (male, female) and race (black, white, other). In this case, it may be
more parsimonious to express the variations among the data sources by the
product of two weight matrices, namely, one associated with sex and another
with race.

Generally, if the s data sources can be regarded as the result of an s1 x s2
classification we could write

(11)  $W_k = W1_{k1} \, W2_{k2}, \quad k1 = 1, 2, \ldots, s1, \; k2 = 1, 2, \ldots, s2,$

where k = (k1-1)*s2 + k2.
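The indexing convention in (11) can be made concrete with a short sketch. The
function names below (`combined_index`, `split_index`, `combined_weights`) are
hypothetical and serve only to illustrate the mapping k = (k1-1)*s2 + k2 and
the fact that a product of diagonal weight matrices is the elementwise product
of their diagonals.

```python
def combined_index(k1, k2, s2):
    """Map 1-based (k1, k2) to the 1-based data source index k."""
    return (k1 - 1) * s2 + k2


def split_index(k, s2):
    """Inverse of combined_index: recover 1-based (k1, k2) from k."""
    k1, k2 = divmod(k - 1, s2)
    return k1 + 1, k2 + 1


def combined_weights(w1, w2):
    """Diagonal of W_k = W1_{k1} W2_{k2}, given the two diagonals as lists."""
    return [a * b for a, b in zip(w1, w2)]


# The sex x race example from the text: s1 = 2 and s2 = 3 give six sources.
s1, s2 = 2, 3
table = {(k1, k2): combined_index(k1, k2, s2)
         for k1 in range(1, s1 + 1) for k2 in range(1, s2 + 1)}
```

With r factors, the six diagonal matrices $W_k$ contain 6r weights, whereas the
two W1 and three W2 matrices contain only 5r before normalization, and the
saving grows with the size of the classification.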

The idea is similar to the usual decomposition of the cell means into the
combination of the column and row effects in an s1 x s2 factorial experimental
design, except that there the decomposition is additive rather than
multiplicative. The general form of this decomposition is known as the
Canonical Decomposition of N-Way Tables (see Carroll and Pruzansky 1984).
These weight matrices can be interpreted in the same way as before.


If it is desired to decompose the means we could have the usual ANOVA
decomposition such as

(12)  $m_k = m + m1_{k1} + m2_{k2}, \quad k1 = 1, 2, \ldots, s1, \; k2 = 1, 2, \ldots, s2,$

where k = (k1-1)*s2 + k2.

The  Maximum  Likelihood  Estimation  by  the  EM  Algorithm 

With the additional assumption of normality

(13)  $f_{ik} \sim N_r( 0, C_p ), \quad i = 1, 2, \ldots, N_k, \; k = 1, 2, \ldots, s,$  iid,

and

      $e_{ik} \sim N_p( 0, D_k ), \quad i = 1, 2, \ldots, N_k, \; k = 1, 2, \ldots, s,$  iid,

and the statistical independence of the factor scores and the error terms, we
have

(14)  $S_k \sim W_p( \Sigma_k, N_k ), \quad k = 1, 2, \ldots, s,$

where $S_k$ is the sample mean corrected SSCP matrix,

      $\Sigma_k = A W_k C_p W_k A' + D_k,$

and

$W_p( \Sigma, df )$ denotes the p-variate Wishart distribution with df degrees
of freedom and mean df x $\Sigma$.

Here the parameter $m_k$, k=1,2,...,s, is estimated by the sample mean and
treated as a constant when deriving the Wishart distributions.

The MLE based on this model can be found by differentiating the product of
the s Wishart likelihood functions with respect to A, $C_p$, $D_k$, and $W_k$,
k=1,2,...,s. However, noticing that the Wishart likelihood presented above is
the likelihood of $f_{ik}$, A, $C_p$, $W_k$, $D_k$, i=1,2,...,$N_k$,
k=1,2,...,s, with all the $f_{ik}$'s integrated out, we could instead use the
following EM algorithm where the factor scores are treated as missing data.
This approach, originally advocated by Rubin and Thayer (1982) in the standard
factor analysis context, has the definite advantage of simplicity of the
calculation involved due to the linear (tri-linear) nature of the complete data
likelihood.
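For reference, the marginal objective implied by (14) can be written out
explicitly. The expression below is a sketch based on the standard form of the
Wishart density, taking the degrees of freedom to be $N_k$ as in (14);
$\Sigma_k$ denotes the dispersion matrix implied by the model, $S_k$ the mean
corrected SSCP matrix of data source k, and the constant collects every term
that does not involve the parameters.

```latex
-2 \ln L_{\mathrm{marginal}}
  = \sum_{k=1}^{s} \Bigl[ N_k \ln |\Sigma_k|
      + \mathrm{tr}\bigl( \Sigma_k^{-1} S_k \bigr) \Bigr] + \text{constant},
\qquad
\Sigma_k = A W_k C_p W_k A' + D_k .
```

Minimizing this expression directly requires derivatives through
$\Sigma_k^{-1}$ and $|\Sigma_k|$, in which A and $W_k$ enter nonlinearly, which
is the complication the complete data formulation avoids.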

The application of the general EM algorithm scheme in this context is
outlined below. For further discussion of the EM algorithm see Mayekawa(1985).
In the E-step, the expectation of the log complete data likelihood with respect
to the conditional distribution of the factor scores given data, factor
loadings, data source weights, and the error variances is calculated. The
complete data likelihood is given by

(15)  $L = f( Y | F, A, W, D ) \times f( F | C_p ),$

where

(16)  $f( Y | F, A, W, D ) = \prod_{k=1}^{s} [ f( Y_k | F_k, A, W_k, D_k ) ],$

where

(17)  $f( Y_k | F_k, A, W_k, D_k ) = \prod_{i=1}^{N_k} [ f( y_{ik} | f_{ik}, A, W_k, D_k ) ],$

where

(18)  $-2 \ln f( y_{ik} | f_{ik}, A, W_k, D_k ) = ( y_{ik} - A W_k f_{ik} )' D_k^{-1} ( y_{ik} - A W_k f_{ik} ) + \ln |D_k|$
      + constant which does not involve the parameters,

and

(19)  $f( F | C_p ) = \prod_{k=1}^{s} [ f( F_k | C_p ) ],$

where

(20)  $-2 \ln f( F_k | C_p ) = \mathrm{tr}[ F_k C_p^{-1} F_k' ] + N_k \ln |C_p|$
      + constant.

It should be noted that, in order to avoid notational complexity, the mean
deviation score, $y_{ik} - m_k$, is denoted by $y_{ik}$ in the above
expressions and throughout this section. Also, the $N_k$ x p column centered
data matrix of the data source k is denoted by $Y_k$, and the $N_k$ x r factor
score matrix, by $F_k$.

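The conditional distribution of the factor score is

(21)  $f_{ik} | Y, A, W, D, C_p \sim N_r( f_{ik}^*, V_k^* ), \quad i = 1, 2, \ldots, N_k,$  i.i.d., $k = 1, 2, \ldots, s,$

where

(22)  $V_k^* = ( W_k A' D_k^{-1} A W_k + C_p^{-1} )^{-1}, \quad k = 1, 2, \ldots, s,$

and

(23)  $F_k^* = Y_k D_k^{-1} A W_k V_k^*, \quad k = 1, 2, \ldots, s,$

and the expectation of ln L is given by substituting $f_{ik}^*$ for $f_{ik}$ in
(18) and (20) and adding the terms

(24)  $\mathrm{tr}[ W_k A' D_k^{-1} A W_k V_k^* ]$

and

      $N_k \, \mathrm{tr}[ C_p^{-1} V_k^* ]$

to (18) and (20), respectively. The result can be expressed as

(25)  $-2 E[ \ln L ] = \sum_{k=1}^{s} \bigl[ \, \mathrm{tr}[ ( Y_k - F_k^* W_k A' ) D_k^{-1} ( Y_k - F_k^* W_k A' )' ]$

      $+ N_k \ln |D_k|$

      $+ N_k \, \mathrm{tr}[ W_k A' D_k^{-1} A W_k V_k^* ]$

      $+ \mathrm{tr}[ C_p^{-1} F_k^{*\prime} F_k^* ] + N_k \, \mathrm{tr}[ C_p^{-1} V_k^* ]$

      $+ N_k \ln |C_p| \, \bigr]$

      + constant which does not involve the parameters.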

Since the conditional expectation can be expressed as a function of
$F_k^{*\prime} F_k^*$ and $Y_k' F_k^*$, the $N_k$ x r matrices, $F_k^*$'s, need
not be stored in the course of calculation. Thus, the E-step can be summarized
as follows.

The E-step.

(26)  $V_k^* = ( W_k A' D_k^{-1} A W_k + C_p^{-1} )^{-1}, \quad k = 1, 2, \ldots, s,$

(27)  $Y_k' F_k^* = S_k D_k^{-1} A W_k V_k^*, \quad k = 1, 2, \ldots, s,$

and

(28)  $F_k^{*\prime} F_k^* = V_k^* W_k A' D_k^{-1} Y_k' F_k^*, \quad k = 1, 2, \ldots, s.$
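The E-step (26) through (28) can be sketched in a few lines of code, which
also makes it visible that only the SSCP matrix of each data source is needed.
This is a minimal illustration under stated assumptions, not the program used
for the analysis in this paper: the function names are hypothetical, the
matrices are plain lists of rows, and the small Gauss-Jordan `inverse` is meant
for well conditioned toy problems only.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def inverse(M):
    """Invert a small nonsingular matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * u for v, u in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def e_step(S_k, A, w_k, d_k, C):
    """Return (V, YF, FF) of (26), (27) and (28) for one data source.

    S_k : p x p mean corrected SSCP matrix
    A   : p x r factor loadings
    w_k : length-r diagonal of W_k
    d_k : length-p diagonal of D_k
    C   : r x r factor dispersion matrix
    """
    p, r = len(A), len(w_k)
    AW = [[A[j][l] * w_k[l] for l in range(r)] for j in range(p)]       # A W_k
    DinvAW = [[AW[j][l] / d_k[j] for l in range(r)] for j in range(p)]  # D_k^-1 A W_k
    WADAW = matmul(transpose(AW), DinvAW)                               # W_k A' D_k^-1 A W_k
    Cinv = inverse(C)
    V = inverse([[WADAW[l][m] + Cinv[l][m] for m in range(r)]
                 for l in range(r)])                                    # (26)
    YF = matmul(matmul(S_k, DinvAW), V)                                 # (27)
    FF = matmul(matmul(V, transpose(DinvAW)), YF)                       # (28)
    return V, YF, FF
```

For instance, with one factor, two variables, unit loadings, unit weight, unit
error variances and unit factor variance, (26) reduces to 1 / (2 + 1) = 1/3.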


In the M-step the conditional expectation of the log complete data likelihood
is maximized with respect to A, $W_k$, $D_k$, k=1,2,...,s, and $C_p$, treating
the E-step quantities (26) through (28) as constants.

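The M-step.

(29)  $a_j = \bigl[ \sum_{k=1}^{s} W_k ( F_k^{*\prime} F_k^* + N_k V_k^* ) W_k / d_{jk} \bigr]^{-1} \bigl[ \sum_{k=1}^{s} W_k F_k^{*\prime} y_{jk} / d_{jk} \bigr], \quad j = 1, 2, \ldots, p,$

where $a_j'$ is the j-th row of A, $y_{jk}$ is the j-th column of $Y_k$, and
$d_{jk}$ is the j-th diagonal element of $D_k$,

(30)  $d_{jk} = ( RSS_{jk} + N_k \, a_j' W_k V_k^* W_k a_j ) / N_k, \quad j = 1, 2, \ldots, p, \; k = 1, 2, \ldots, s,$

where $RSS_{jk} = ( y_{jk} - F_k^* W_k a_j )' ( y_{jk} - F_k^* W_k a_j )$,

(31)  $W_k = \mathrm{diag}[ c ], \quad k = 1, 2, \ldots, s,$

where $c = T^{-1} h$,

      $h_l = \sum_{j=1}^{p} a_{jl} ( Y_k' F_k^* )_{jl} / d_{jk}, \quad l = 1, 2, \ldots, r,$

      $t_{lm} = \sum_{j=1}^{p} [ a_{jl} a_{jm} / d_{jk} ] \times ( F_k^{*\prime} F_k^* + N_k V_k^* )_{lm}, \quad l, m = 1, 2, \ldots, r,$

and

(32)  $C_p = ( 1 / N_+ ) \sum_{k=1}^{s} ( F_k^{*\prime} F_k^* + N_k V_k^* ),$

where $N_+ = \sum_{k=1}^{s} N_k$.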
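Two of the M-step updates, (30) and (32), can be sketched directly in terms of
the E-step quantities. The sketch relies on the algebraic identity
RSS = y'y - 2 b'F'y + b'F'F b with b = $W_k a_j$, so that the residual sum of
squares is obtained from $S_k$, $Y_k'F_k^*$ and $F_k^{*\prime}F_k^*$ without
storing the factor scores. The function names and argument layout are
hypothetical choices made for this illustration.

```python
def update_error_variance(j, S_k, YF_k, FF_k, V_k, N_k, A, w_k):
    """Update (30) for variable j (0-based) in one data source.

    S_k  : p x p mean corrected SSCP matrix
    YF_k : p x r matrix Y_k'F_k* from (27)
    FF_k : r x r matrix F_k*'F_k* from (28)
    V_k  : r x r matrix V_k* from (26)
    N_k  : number of individuals in the data source
    A    : p x r factor loadings
    w_k  : length-r diagonal of W_k
    """
    r = len(w_k)
    b = [A[j][l] * w_k[l] for l in range(r)]          # b = W_k a_j
    rss = (S_k[j][j]
           - 2.0 * sum(b[l] * YF_k[j][l] for l in range(r))
           + sum(b[l] * FF_k[l][m] * b[m]
                 for l in range(r) for m in range(r)))
    correction = N_k * sum(b[l] * V_k[l][m] * b[m]
                           for l in range(r) for m in range(r))
    return (rss + correction) / N_k


def update_factor_dispersion(FF, V, N):
    """Update (32): average of F_k*'F_k* + N_k V_k* over all individuals.

    FF, V : lists (one entry per data source) of r x r matrices
    N     : list of the numbers of individuals N_k
    """
    r = len(V[0])
    n_total = sum(N)
    return [[sum(FF[k][l][m] + N[k] * V[k][l][m] for k in range(len(N)))
             / n_total for m in range(r)] for l in range(r)]
```

The remaining updates, (29) and (31), have the same ingredients but
additionally require the solution of an r x r linear system for each row of A
and for each data source, respectively.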


The above formulae are derived by taking the partial derivative of the
conditional expectation of the complete log likelihood with respect to each
set of parameters and solving the resulting normal equations independently of
the normal equations for the rest of the parameters. Therefore, strictly
speaking, they do not provide the maximum of the conditional expectation but
merely increase the value of the conditional expectation.

In order to avoid multiplicative redundancy, the $C_p$ matrix should be
normalized so that its diagonal elements are equal to one, and the $W_k$
matrices should also be normalized so that the average of the squared $W_k$
matrices is the identity matrix.
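The normalization can be carried out without changing the fit by moving the
removed scale factors into the columns of A. The sketch below is one way to do
this under two assumptions that the text does not spell out: the average of
the squared weight matrices is taken as an unweighted average over the data
sources, and the compensating factors are absorbed into A. The function name
`normalize` is hypothetical.

```python
import math


def normalize(A, W, C):
    """Rescale (A, W, C) so that diag(C) = 1 and mean_k of w_kl^2 = 1.

    A : p x r factor loadings (list of rows)
    W : list of s lists, each the length-r diagonal of one W_k
    C : r x r factor dispersion matrix
    Returns new (A, W, C) with A W_k C W_k A' unchanged for every k.
    """
    r, s = len(C), len(W)
    # Scale removed from factor l when the diagonal of C is set to one.
    c_scale = [math.sqrt(C[l][l]) for l in range(r)]
    C_new = [[C[l][m] / (c_scale[l] * c_scale[m]) for m in range(r)]
             for l in range(r)]
    # Scale removed from column l of the weights (root mean square over k).
    w_scale = [math.sqrt(sum(W[k][l] ** 2 for k in range(s)) / s)
               for l in range(r)]
    W_new = [[W[k][l] / w_scale[l] for l in range(r)] for k in range(s)]
    # Both scales are absorbed by the corresponding column of A.
    A_new = [[A[j][l] * c_scale[l] * w_scale[l] for l in range(r)]
             for j in range(len(A))]
    return A_new, W_new, C_new
```

Because each factor is rescaled by a positive constant only, the off-diagonal
elements of the normalized $C_p$ are the factor correlations.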

As in standard factor analysis, the mean vectors and the SSCP matrices of
each data source are sufficient to estimate the parameters A, $W_k$,
k=1,2,...,s, and $D_k$, k=1,2,...,s.

Optionally, we could enforce some equality restrictions such as

(33)  $D_1 = D_2 = \cdots = D_s = D,$

(34)  $D_k = d_k I, \quad k = 1, 2, \ldots, s,$

or

(35)  $D_1 = D_2 = \cdots = D_s = d I.$

With the last restriction the MLE is equivalent to the least squares estimate.
Also, when the factors are assumed to be orthogonal we simply skip the
estimation of $C_p$, holding it to the identity matrix.
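As an illustration of how such a restriction changes the M-step, the following
sketch gives the pooled updates that would take the place of (30) under (33)
and under (35). They are obtained by setting to zero the derivative of the
part of the conditional expectation that involves the error variances, and are
offered as a derivation under the stated model rather than as the formulae
used for the computations reported here. The symbols $RSS_{jk}$, $a_j$, $W_k$,
$V_k^*$ and $N_+$ are those of (30) and (32).

```latex
% Under (33), D_1 = \cdots = D_s = D = \mathrm{diag}(d_1, \ldots, d_p):
d_j = \frac{1}{N_+} \sum_{k=1}^{s}
      \bigl( \mathrm{RSS}_{jk} + N_k \, a_j' W_k V_k^* W_k a_j \bigr),
\qquad j = 1, 2, \ldots, p .

% Under (35), D_1 = \cdots = D_s = d I:
d = \frac{1}{p \, N_+} \sum_{j=1}^{p} \sum_{k=1}^{s}
    \bigl( \mathrm{RSS}_{jk} + N_k \, a_j' W_k V_k^* W_k a_j \bigr) .
```

Under either restriction the error variance no longer depends on k, so it
cancels between the two bracketed sums in (29).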

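The M-step for the W1 and W2 matrices can be derived using a similar
linearization technique. That is, noticing that the conditional distribution
of f and the conditional expectation of ln L are the same as (21), (22),
(23) and (25) with the $W_k$ matrix and the subscript k defined by (11), all of
the E-step and the M-step except for (31) remain the same. For the data source
weight matrices, (31) should be modified as follows:

(36)  $W1_{k1} = \mathrm{diag}[ c ], \quad k1 = 1, 2, \ldots, s1,$

where $c = ( \sum_{k2=1}^{s2} T_{k2} )^{-1} ( \sum_{k2=1}^{s2} h_{k2} )$,

      $h_{k2,l} = \sum_{j=1}^{p} a_{jl} \, w2_{k2,l} \, ( Y_{k1 k2}' F_{k1 k2}^* )_{jl} / d_{jk}, \quad l = 1, 2, \ldots, r,$

      $t_{k2,lm} = \sum_{j=1}^{p} [ a_{jl} a_{jm} \, w2_{k2,l} \, w2_{k2,m} / d_{jk} ] \times ( F_{k1 k2}^{*\prime} F_{k1 k2}^* + N_{k1 k2} V_{k1 k2}^* )_{lm}, \quad l, m = 1, 2, \ldots, r,$

and

(37)  $W2_{k2} = \mathrm{diag}[ c ], \quad k2 = 1, 2, \ldots, s2,$

where $c = ( \sum_{k1=1}^{s1} T_{k1} )^{-1} ( \sum_{k1=1}^{s1} h_{k1} )$,

      $h_{k1,l} = \sum_{j=1}^{p} a_{jl} \, w1_{k1,l} \, ( Y_{k1 k2}' F_{k1 k2}^* )_{jl} / d_{jk}, \quad l = 1, 2, \ldots, r,$

      $t_{k1,lm} = \sum_{j=1}^{p} [ a_{jl} a_{jm} \, w1_{k1,l} \, w1_{k1,m} / d_{jk} ] \times ( F_{k1 k2}^{*\prime} F_{k1 k2}^* + N_{k1 k2} V_{k1 k2}^* )_{lm}, \quad l, m = 1, 2, \ldots, r.$

Here $w1_{k1,l}$ and $w2_{k2,l}$ denote the l-th diagonal elements of
$W1_{k1}$ and $W2_{k2}$, and the subscript k1 k2 refers to the data source
k = (k1-1)*s2 + k2.

Also, normalizations such as setting the average of the squared $W1_{k1}$
matrices and the average of the squared $W2_{k2}$ matrices to the identity
matrix shall be enforced.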
Initial Configuration

The most efficient way to calculate the initial configuration seems to be
the application of the SUMSCAL algorithm advocated by de Leeuw and
Pruzansky(1978). The method, which assumes orthogonality of the factors and is
restricted to the three mode situation, has been used in Novick, et al.
(1983). When a four or higher mode model is used, the log additive
decomposition of the initial $W_k$ matrices should provide a reasonable
estimate of each weight matrix.


Standardization  of  the  Raw  Data 

Since the MLE has the useful property of scale invariance, we may rescale
each variable to a desired form. The usual practice in standard factor
analysis is to scale each variable so that it has zero mean and unit
variance. However, as pointed out by Joereskog(1971) and Harshman and
Lundy(1984), standardization within each data source changes the form of the
likelihood. That is, after subtracting each within data source mean, the
rescaling must be performed by multiplying each variable by a constant that is
common across all the data sources. The most convenient approach is to rescale
the variables so that the average of the rescaled dispersion matrices has unit
diagonal elements. The number of individuals may or may not be used to weight
the averaging process.
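The rescaling described above can be sketched as follows. The function name
`rescale_dispersions` and its arguments are hypothetical; the sketch takes the
within data source dispersion matrices as given (so the means have already
been removed) and applies one constant per variable to every data source.

```python
import math


def rescale_dispersions(S_list, N=None):
    """Rescale dispersion matrices by constants common to all data sources.

    S_list : list of s dispersion matrices, each p x p (lists of rows)
    N      : optional list of sample sizes used to weight the average;
             if omitted, the data sources are weighted equally
    Returns (rescaled matrices, list of the p scaling constants).
    """
    s, p = len(S_list), len(S_list[0])
    if N is None:
        weights = [1.0 / s] * s
    else:
        weights = [n / sum(N) for n in N]
    # Average variance of each variable across the data sources.
    avg_var = [sum(w * S[j][j] for w, S in zip(weights, S_list))
               for j in range(p)]
    c = [1.0 / math.sqrt(v) for v in avg_var]
    # Multiplying variable j by c[j] multiplies element (i, j) by c[i] * c[j].
    rescaled = [[[c[i] * S[i][j] * c[j] for j in range(p)] for i in range(p)]
                for S in S_list]
    return rescaled, c
```

After this step the (weighted) average of the rescaled matrices has unit
diagonal elements, while differences in variance between the data sources,
which carry the information about the weights, are preserved.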


The  method  proposed  in  the  previous  sections  is  applied  to  a  subset  of 
ASVAB  Form  8. 

The  variables  analyzed  are: 

1.  General  Science  (GS) 

2.  Arithmetic  Reasoning  (AR) 

3.  Word  Knowledge  (WK) 

4.  Paragraph  Comprehension  (PC) 

5.  Numerical  Operations  (speeded)  (NO) 

6.  Coding  Speed  (speeded)  (CS) 

7.  Auto-Shop  Information  (AS) 

8.  Mathematics Knowledge (MK)

9.  Mechanical  Comprehension  (MC) 

10.  Electronics Information (EI)

The means and the standard deviations are shown in Table 1. This set of
variables is known to have the following four factors; see, for example, Ree,
Mullins, Mathews and Massey(1981):

1.  Verbal:  variables 1, 3, 4

2.  Technical:  variables 7, 9, 10

3.  Mathematical:  variables 2, 8

4.  Speeded:  variables 5, 6

It is also known that these factors are positively correlated, with most
correlations in the .4 range.

The scores of these ten variables are available for the following six data
sources, which are the combinations of two armed services (Marine Corps, Air
Force) and three specialties (Clerical, Mechanical, Electrical). The number of
individuals in each data source is:

1.  MC - CLE   3285

2.  MC - MEC   3118

3.  MC - ELE   1415

4.  AF - CLE   8963

5.  AF - MEC  16884

6.  AF - ELE   7897

In  the  analysis  we  assume  only  that  there  are  four  factors  and  attempt  to 
demonstrate  that  the  usually  accepted  pattern  of  factor  loadings  can  be  found 
using  the  PARAFAC  model. 

The  six  sample  dispersion  matrices  are  first  rescaled  so  that  the  diagonal 
elements  of  the  weighted  average  are  equal  to  unity.  The  resulting  rescaled 
dispersion  matrices  are  shown  in  Table  2. 

The four mode analysis of this data set by the maximum likelihood method
proposed in the previous sections, with r = 4, resulted in the parameter
estimates shown in Table 3. The error variances are assumed to be equal across
the data sources. The diagonal elements of each W matrix are arranged to form
a data source x factor matrix in Table 3.

First, it should be noted that, without any rotation, the four mode
analysis recovered the four dimensions found by the standard two mode
analysis. Following Ree, et al.(1981), we could name the first factor
Technical, the second, Speeded, the third, Verbal, and the fourth,
Mathematical. The major difference is in the factor dispersion matrix: their
solution is more oblique, whereas the highest factor correlation in our
solution is about 0.2. As a result, the Technical factor, the first factor, is
more influential than their corresponding factor. Once again, we emphasize
that NO rotations are performed on the final result.

Second, the inspection of the W1 matrices confirms the fact that the Air
Force is more selective in general. This is shown by the smaller values in the
W1 weight matrix, which represents the difference between the Air Force and the
Marine Corps. In particular, the Air Force is highly selective on the Speeded
factor, the second factor. The means in Table 1 of variables 5 and 6, which
are highly loaded on this factor, are generally higher for the Air Force than
for the Marine Corps in all three specialty areas (the one exception in Table
1, Coding Speed in the mechanical specialty, is a difference of 0.04).

Also, the W2 matrix shows that the mechanical specialty and the clerical
specialty areas have smaller weights on, respectively, the Technical and
Speeded factors. This, combined with the inspection of the means and the
standard deviations in Table 1, shows that mechanical specialists are
homogeneous in those variables which are highly loaded on the Technical factor
and also have higher scores on those variables. The same argument applies to
the clerical specialists with respect to the Speeded factor. As a result, the
Air Force - Clerical specialists have the smallest variances of variables 5
and 6, which can be confirmed by Table 1 and Table 2.

Discussion 

In this section we discuss a slightly different formulation of the model
which enables us to compare the factor score means. Consider the additive
indeterminacy in (3). The formula says that the subtraction of the mean factor
score, $\bar{f}_k$, can be compensated by the addition of the corresponding
quantity to $m_k$.

In the previous section we removed this redundancy by setting the factor
score mean equal to zero. This approach is equivalent to defining $m_k$ as the
mean of the observation,

(38)  $E[ y_{ik} ] = m_k + A W_k \bar{f}_k = m_k.$

The reason why we chose to use this method is that this is the usual constraint
in standard factor analysis where the factor score mean is not of interest.
There are, however, other ways to remove this redundancy, especially in the
PARAFAC situation. For example, we could set all the $m_k$'s equal to zero,

(39)  $E[ y_{ik} ] = A W_k \bar{f}_k,$

and treat the $\bar{f}_k$'s as additional parameters to be estimated. The
implication of this formulation is that, within each data source, if a subset
of variables has similar factor loadings they must have similar means. That
is, if variables j and j' have identical factor loadings, i.e., if the j-th and
the j'-th rows of the A matrix are identical, their means must be identical
within each data source. This may not be a realistic assumption in practice.
For example, when a test and its half-test are analyzed together we expect that
the mean of the half-test is about half of the mean of the full test while
expecting that both tests have similar factor loadings. Another way to reduce
the redundancy is to assume

(40)  $E[ y_{ik} ] = m + A W_k \bar{f}_k.$

Since there are some redundancies left in (40) we further define m as the grand
mean across all the data sources. (The restriction that the average of the
$\bar{f}_k$ is equal to 0 can also remove the remaining redundancy.) Note that
this is more restrictive than (38) since (38) does not enforce any structural
restrictions on each mean while (40) assumes that each mean is the sum of the
grand mean and a vector which lies in the column space of $A W_k$. Therefore,
we may say that this approach is a compromise between our original approach,
(38), and the more restrictive case, (39).

The analysis under this assumption may be performed by modifying the
conditional distribution of f in (21) and the resulting expectation of the log
complete data likelihood. The factor score means should be estimated in the
M-step.


Summary 

The traditional factor analytic view of the PARAFAC model and its
extension to a four mode situation, with the derivation of the maximum
likelihood estimation procedure by the generalized EM algorithm, were
presented. The four mode model was applied to six data matrices defined by
three specialties (clerical, mechanical, and electrical) times two services
(Air Force and Marine Corps) and successfully recovered the usual four
dimensional structure without any rotation. The specialty and service
differences were expressed in terms of different weightings of the common
factor structure. A model which allows us to compare the factor score means
was also investigated.
Table 1.  Means and Standard Deviations of the Original Variables

Means

        MC-CLE   MC-MEC   MC-ELE   AF-CLE   AF-MEC   AF-ELE
GS      16.940   17.030   17.850   15.986   17.332   18.507
AR      20.490   19.110   20.260   19.755   19.715   22.248
WK      28.140   26.640   27.900   27.436   27.175   28.624
PC      11.490   10.870   11.120   11.789   11.550   12.131
NO      41.170   37.140   37.590   43.316   38.167   39.968
CS      55.070   47.200   47.700   55.475   47.160   49.460
AS      15.970   18.730   17.030   14.307   18.921   18.670
MK      14.810   13.280   15.100   14.291   13.509   16.256
MC      15.800   17.390   16.510   14.238   17.244   17.954
EI      12.450   13.650   13.630   11.599   13.535   14.394

Standard deviations

        MC-CLE   MC-MEC   MC-ELE   AF-CLE   AF-MEC   AF-ELE
GS       3.832    3.675    3.683    3.684    3.335    3.538
AR       5.609    5.106    5.441    5.114    4.981    5.169
WK       4.888    4.880    4.863    4.545    4.569    4.628
PC       2.503    2.582    2.766    1.951    2.098    2.050
NO       8.090    8.800    9.010    5.779    7.483    7.305
CS      15.021   14.494   15.008   10.925   11.648   12.262
AS       4.928    4.127    4.700    4.623    3.737    4.280
MK       4.944    4.529    4.859    4.537    4.428    4.971
MC       4.643    4.046    4.533    4.257    3.732    3.937
EI       3.470    3.005    3.213    3.178    2.869    3.021

Number of observations

        MC-CLE   MC-MEC   MC-ELE   AF-CLE   AF-MEC   AF-ELE
N         3285     3118     1415     8963    16884     7897
Table 2:  Sample Dispersion Matrices

MC - CLE

             1         2         3         4         5         6
 1    1.178636  0.517880  0.746108  0.579417  0.119386  0.079585
 2    0.517880  1.198937  0.505769  0.571901  0.471982  0.370015
 3    0.746108  0.505769  1.112072  0.708778  0.164813  0.124974
 4    0.579417  0.571901  0.708778  1.343676  0.310126  0.300254
 5    0.119386  0.471982  0.164813  0.310126  1.215994  0.701767
 6    0.079585  0.370015  0.124974  0.300254  0.701767  1.497889
 7    0.667442  0.492772  0.429781  0.426459  0.151502  0.153605
 8    0.484964  0.789714  0.437370  0.472698  0.406709  0.308891
 9    0.652177  0.644871  0.470192  0.514137  0.227436  0.273499
10    0.705888  0.475099  0.529311  0.466305  0.134747  0.124527

             7         8         9        10
 1    0.667442  0.484964  0.652177  0.705888
 2    0.492772  0.789714  0.644871  0.475099
 3    0.429781  0.437370  0.470192  0.529311
 4    0.426459  0.472698  0.514137  0.466305
 5    0.151502  0.406709  0.227436  0.134747
 6    0.153605  0.308891  0.273499  0.124527
 7    1.371366  0.342356  0.868547  0.885173
 8    0.342356  1.143990  0.548740  0.414558
 9    0.868547  0.548740  1.335946  0.795607
10    0.885173  0.414558  0.795607  1.303493
Table 2  (continued)

MC - MEC

             1         2         3         4         5         6
 1    1.083785  0.339531  0.673119  0.552077  0.062664  0.029272
 2    0.339531  0.993451  0.313733  0.406963  0.373296  0.270903
 3    0.673119  0.313733  1.108441  0.666283  0.115560  0.110871
 4    0.552077  0.406963  0.666283  1.430571  0.313665  0.276246
 5    0.062664  0.373296  0.115560  0.313665  1.438242  0.721064
 6    0.029272  0.270903  0.110871  0.276246  0.721064  1.394635
 7    0.418566  0.231351  0.310545  0.289679 -0.017714  0.014750
 8    0.382086  0.602129  0.350146  0.392206  0.330758  0.228661
 9    0.434441  0.386068  0.309557  0.395292  0.066320  0.141343
10    0.494307  0.230606  0.394469  0.375685 -0.003811  0.042069

             7         8         9        10
 1    0.418566  0.382086  0.434441  0.494307
 2    0.231351  0.602129  0.386068  0.230606
 3    0.310545  0.350146  0.309557  0.394469
 4    0.289679  0.392206  0.395292  0.375685
 5   -0.017714  0.330758  0.066320 -0.003811
 6    0.014750  0.228661  0.141343  0.042069
 7    0.961952  0.146613  0.533945  0.537735
 8    0.146613  0.959664  0.333154  0.218805
 9    0.533945  0.333154  1.014227  0.491059
10    0.537735  0.218805  0.491059  0.977577


Table  2  (continued) 


MC - ELE

            1          2          3          4          5          6
 1   1.088970   0.444932   0.686065   0.617761   0.175045   0.151668
 2   0.444932   1.128117   0.462095   0.590886   0.512778   0.433683
 3   0.686065   0.462095   1.100700   0.807166   0.220489   0.225016
 4   0.617761   0.590886   0.807166   1.641387   0.388635   0.456438
 5   0.175045   0.512778   0.220489   0.388635   1.508016   0.766121
 6   0.151668   0.433683   0.225016   0.456438   0.766121   1.495214
 7   0.574712   0.458135   0.447760   0.508952   0.207431   0.202605
 8   0.465069   0.723645   0.467525   0.546951   0.441771   0.367036
 9   0.574713   0.601606   0.474091   0.593750   0.293606   0.314584
10   0.604843   0.410017   0.508383   0.476367   0.229347   0.138149

            7          8          9         10
 1   0.574712   0.465069   0.574713   0.604843
 2   0.458135   0.723645   0.601606   0.410017
 3   0.447760   0.467525   0.474091   0.508383
 4   0.508952   0.546951   0.593750   0.476367
 5   0.207431   0.441771   0.293606   0.229347
 6   0.202605   0.367036   0.314584   0.138149
 7   1.247264   0.323559   0.777027   0.684396
 8   0.323559   1.105020   0.542661   0.375034
 9   0.777027   0.542661   1.272915   0.694490
10   0.684396   0.375034   0.694490   1.117525

Table 2 (continued)


AF - CLE

            1          2          3          4          5          6
 1   1.089612   0.374383   0.616486   0.386810  -0.047248  -0.050883
 2   0.374383   0.996668   0.264395   0.259636   0.162585   0.101066
 3   0.616486   0.264395   0.961422   0.430905  -0.082324  -0.056227
 4   0.386810   0.259636   0.430905   0.816532  -0.021440   0.008320
 5  -0.047248   0.162585  -0.082324  -0.021440   0.620489   0.272165
 6  -0.050883   0.101066  -0.056227   0.008320   0.272165   0.792366
 7   0.545702   0.399244   0.367770   0.276713  -0.043238  -0.058872
 8   0.361148   0.613614   0.251703   0.238704   0.170579   0.093392
 9   0.534856   0.489634   0.351526   0.284509  -0.016586  -0.013364
10   0.550750   0.332561   0.404871   0.263472  -0.050534  -0.057021

            7          8          9         10
 1   0.545702   0.361148   0.534856   0.550750
 2   0.399244   0.613614   0.489634   0.332561
 3   0.367770   0.251703   0.351526   0.404871
 4   0.276713   0.238704   0.284509   0.263472
 5  -0.043238   0.170579  -0.016586  -0.050534
 6  -0.058872   0.093392  -0.013364  -0.057021
 7   1.206954   0.265190   0.670281   0.685928
 8   0.265190   0.963079   0.436316   0.288179
 9   0.670281   0.436316   1.122904   0.576577
10   0.685928   0.288179   0.576577   1.093693

Table 2 (continued)


AF - MEC

            1          2          3          4          5          6
 1   0.892798   0.255790   0.538381   0.352900  -0.010703   0.000065
 2   0.255790   0.945362   0.256308   0.278090   0.294504   0.177461
 3   0.538381   0.256308   0.971718   0.468687   0.001412   0.054271
 4   0.352900   0.278090   0.468687   0.943868   0.097563   0.125340
 5  -0.010703   0.294504   0.001412   0.097563   1.040331   0.471375
 6   0.000065   0.177461   0.054271   0.125340   0.471375   0.900619
 7   0.245481   0.185494   0.167392   0.129370  -0.082255  -0.054124
 8   0.282560   0.556848   0.261338   0.270063   0.307642   0.185536
 9   0.316671   0.357519   0.242488   0.223441  -0.003437   0.033758
10   0.376656   0.211314   0.314900   0.215375  -0.067750  -0.041885

            7          8          9         10
 1   0.245481   0.282560   0.316671   0.376656
 2   0.185494   0.556848   0.357519   0.211314
 3   0.167392   0.261338   0.242488   0.314900
 4   0.129370   0.270063   0.223441   0.215375
 5  -0.082255   0.307642  -0.003437  -0.067750
 6  -0.054124   0.185536   0.033758  -0.041885
 7   0.788617   0.081958   0.363429   0.435066
 8   0.081958   0.917385   0.306355   0.184430
 9   0.363429   0.306355   0.862953   0.383867
10   0.435066   0.184430   0.383867   0.891428

Table  2  (continued) 


AF - ELE

            1          2          3          4          5          6
 1   1.004770   0.429246   0.644426   0.440293   0.073658   0.085264
 2   0.429246   1.018155   0.412641   0.395814   0.352546   0.289226
 3   0.644426   0.412641   0.997157   0.522813   0.081628   0.143281
 4   0.440293   0.395814   0.522813   0.901530   0.139728   0.179852
 5   0.073658   0.352546   0.081628   0.139728   0.991341   0.531685
 6   0.085264   0.289226   0.143281   0.179852   0.531685   0.998183
 7   0.382372   0.248730   0.268558   0.210098  -0.078564  -0.057613
 8   0.488072   0.751722   0.432582   0.383563   0.386840   0.307277
 9   0.465788   0.431552   0.347703   0.296424   0.025306   0.059673
10   0.497701   0.332582   0.397299   0.281620  -0.018622  -0.009183

            7          8          9         10
 1   0.382372   0.488072   0.465788   0.497701
 2   0.248730   0.751722   0.431552   0.332582
 3   0.268558   0.432582   0.347703   0.397299
 4   0.210098   0.383563   0.296424   0.281620
 5  -0.078564   0.386840   0.025306  -0.018622
 6  -0.057613   0.307277   0.059673  -0.009183
 7   1.034557   0.134562   0.548569   0.560425
 8   0.134562   1.156241   0.405702   0.323375
 9   0.548569   0.405702   0.960397   0.515497
10   0.560425   0.323375   0.515497   0.988374


Table  3:  Parameter  Estimates 


A-MATRIX: Factor loadings matrix

            1          2          3          4
 1   0.559643   0.184060  -0.440180  -0.208001
 2   0.475841   0.257108   0.103408  -0.492387
 3   0.355282   0.305298  -0.721310  -0.283648
 4   0.323135   0.332383  -0.371852  -0.235153
 5  -0.008898   0.791521   0.233699  -0.141539
 6  -0.000931   0.690510   0.120493  -0.060708
 7   0.859566   0.155168  -0.063008   0.276114
 8   0.377196   0.140648   0.129718  -0.715394
 9   0.760223   0.126622  -0.019828  -0.069300
10   0.753870   0.138062  -0.184281   0.080616


W1-MATRIX: Weight matrix associated with each armed service

            1          2          3          4
 1   1.054964   1.191168   1.016724   1.021623
 2   0.941834   0.762311   0.982992   0.977899

W2-MATRIX: Weight matrix associated with each specialty

            1          2          3          4
 1   1.115465   0.797112   1.013535   0.943578
 2   0.857606   1.068602   1.035998   0.960787
 3   1.010074   1.105759   0.948396   1.089288

W-MATRIX: Product of W1 and W2 matrices

            1          2          3          4
 1   1.176776   0.949494   1.030485   0.963982
 2   0.904743   1.272884   1.053324   0.981562
 3   1.065592   1.317145   0.964256   1.112842
 4   1.050583   0.607648   0.996296   0.922724
 5   0.807722   0.814607   1.018378   0.939553
 6   0.951322   0.842933   0.932265   1.065213

Table 3 (continued)


D-MATRIX:  Error  variance* 


1  0.442958 

2  0.422579 

3  0.218548 

4  0.625186 

5  0.394725 

6  0.604116 

7  0.346037 

8  0.260786 

9  0.470492 
10  0.476524 


CF-MATRIX: Factor dispersion matrix

            1          2          3          4
 1   1.000000  -0.102711  -0.023244  -0.136183
 2  -0.102711   1.000000   0.110366  -0.214670
 3  -0.023244   0.110366   1.000000  -0.011888
 4  -0.136183  -0.214670  -0.011888   1.000000
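
The caption of the W-matrix in Table 3 states that it is the product of the W1 and W2 matrices. The printed values are consistent with an element-wise product in which each of the six rows pairs one armed service (a row of W1) with one specialty (a row of W2). The following Python sketch checks this reading against the printed values. The row ordering (service 1 with specialties 1 to 3, then service 2 with specialties 1 to 3) and the helper name `combine_weights` are assumptions made for illustration; they are not taken from the original program.

```python
# Weights as printed in Table 3 (columns are factors 1-4).
W1 = [  # one row per armed service
    [1.054964, 1.191168, 1.016724, 1.021623],
    [0.941834, 0.762311, 0.982992, 0.977899],
]
W2 = [  # one row per specialty
    [1.115465, 0.797112, 1.013535, 0.943578],
    [0.857606, 1.068602, 1.035998, 0.960787],
    [1.010074, 1.105759, 0.948396, 1.089288],
]
W_printed = [  # W-matrix as printed, one row per service/specialty pair
    [1.176776, 0.949494, 1.030485, 0.963982],
    [0.904743, 1.272884, 1.053324, 0.981562],
    [1.065592, 1.317145, 0.964256, 1.112842],
    [1.050583, 0.607648, 0.996296, 0.922724],
    [0.807722, 0.814607, 1.018378, 0.939553],
    [0.951322, 0.842933, 0.932265, 1.065213],
]


def combine_weights(w1, w2):
    """Return one row per (service, specialty) pair.

    Each row is the element-wise product of a row of w1 and a row of w2
    (assumed ordering: all specialties for service 1, then service 2).
    """
    return [[a * b for a, b in zip(r1, r2)] for r1 in w1 for r2 in w2]


W = combine_weights(W1, W2)

# Largest absolute gap between recomputed and printed entries.
max_diff = max(
    abs(x - y)
    for row_calc, row_print in zip(W, W_printed)
    for x, y in zip(row_calc, row_print)
)
print(f"largest discrepancy: {max_diff:.2e}")
```

Under these assumptions the recomputed and printed entries agree to within the rounding of the six-decimal inputs.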